Post-Restart 502 While Docusaurus Is Building

This report documents an incident in which brain.id86.net returned 502 Bad Gateway immediately after the docusaurus container was restarted.

Issue Observed

  • The site became unavailable after running sudo docker restart docusaurus.
  • Cloudflare Tunnel logs showed an origin connection failure:
    • Unable to reach the origin service ... dial tcp 172.20.0.3:3000: connect: connection refused
  • The docusaurus container was up, but nothing inside it was listening on port 3000 yet.

Root Cause

The startup command of the running container was:

sh -c "npm install --legacy-peer-deps && npm run build && npm run serve"

This creates a downtime window because:

  1. npm install and npm run build run first.
  2. During that time, nothing listens on port 3000.
  3. Cloudflare Tunnel forwards traffic immediately and receives connection refused.
  4. End users see 502 until npm run serve starts.
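One way to tell which step of this chain a container has reached is to classify its process list. The sketch below is an illustration only: startup_phase is a hypothetical helper, and it assumes the build and serve steps show up in ps output with command lines containing docusaurus build and docusaurus serve.

```shell
# Hypothetical helper: classify the startup phase from `ps aux` text on stdin.
# Assumption: the npm scripts spawn processes whose command lines contain
# "docusaurus build" or "docusaurus serve".
startup_phase() {
  ps_out=$(cat)
  case "$ps_out" in
    *"docusaurus serve"*) echo "serving" ;;
    *"docusaurus build"*) echo "building" ;;
    # neither process found: still installing, or the chain has exited
    *) echo "pre-build" ;;
  esac
}

# Example (run on the Docker host):
#   sudo docker exec docusaurus /bin/sh -lc "ps aux" | startup_phase
```

While the result is pre-build or building, nothing is listening on port 3000, so requests arriving through the tunnel are refused.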

What Was Checked

# container status
sudo docker ps -a --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"

# docusaurus runtime command
sudo docker inspect -f '{{.Config.Cmd}} | {{.Path}} {{.Args}}' docusaurus

# docusaurus logs
sudo docker logs --tail 120 docusaurus

# cloudflared logs
sudo docker logs --since 2m cloudflared

# running processes in docusaurus container
sudo docker exec docusaurus /bin/sh -lc "ps aux"

Recovery Applied

To restore service immediately, the server was started manually inside the container while the install-and-build sequence was still running:

sudo docker exec -d docusaurus /bin/sh -lc "npm run serve -- --host 0.0.0.0 --port 3000"

Verification Results

# origin responding
curl -s -o /dev/null -w "%{http_code}" http://172.20.0.3:3000
# expected: 200

# serve process running
sudo docker exec docusaurus /bin/sh -lc "ps aux | grep -E 'docusaurus (build|serve)' | grep -v grep"

Outcome:

  • Origin returned 200.
  • docusaurus serve process was confirmed running.
  • Site became reachable again through Cloudflare Access.
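The one-off curl check above can also be wrapped in a retry loop, so the same test can be repeated until the origin answers. This is a minimal sketch, not the procedure used during the incident: probe_origin and wait_for_origin are hypothetical names, and the retry count and delay are arbitrary defaults.

```shell
# Print the HTTP status code returned by the origin.
# curl prints 000 when the connection is refused.
probe_origin() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 2 "$1"
}

# Poll the origin until it returns 200, or give up after max_tries attempts.
wait_for_origin() {
  url=$1
  max_tries=${2:-60}   # assumption: 60 attempts
  delay=${3:-2}        # assumption: 2 seconds between attempts
  try=1
  while [ "$try" -le "$max_tries" ]; do
    if [ "$(probe_origin "$url")" = "200" ]; then
      echo "origin healthy after $try attempt(s)"
      return 0
    fi
    sleep "$delay"
    try=$((try + 1))
  done
  echo "origin still unreachable after $max_tries attempt(s)" >&2
  return 1
}

# Example:
#   wait_for_origin http://172.20.0.3:3000 90 2
```

The function returns non-zero when the origin never becomes healthy, so a restart script can stop or alert instead of assuming the site is back.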

Preventive Actions

  1. Keep the running container's startup command aligned with docker-compose.yml, so that startup does not block on long pre-serve steps.
  2. If build/install must run, use a deployment flow that serves the previous build until the new build is ready.
  3. Add a healthcheck gate so tunnel traffic is sent only when origin is healthy.
  4. Keep this runbook command ready for emergency recovery:

sudo docker exec -d docusaurus /bin/sh -lc "npm run serve -- --host 0.0.0.0 --port 3000"
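As a rough illustration of action 2, the new build can be written to a staging directory and swapped in only once it is complete, while the previous build keeps being served. This is a sketch under assumptions, not the site's actual deployment flow: promote_build and the directory names build.next and build.prev are hypothetical, and the approach assumes the running server reads files from disk on each request.

```shell
# Hypothetical helper: promote a finished staged build into the served directory.
promote_build() {
  live=${1:-build}
  staged=${2:-build.next}
  # Refuse to promote a missing or incomplete build.
  if [ ! -f "$staged/index.html" ]; then
    echo "staged build missing or incomplete: $staged" >&2
    return 1
  fi
  # Keep one previous copy for rollback.
  rm -rf "$live.prev"
  if [ -d "$live" ]; then
    mv "$live" "$live.prev"
  fi
  mv "$staged" "$live"
}

# Possible startup sequence (assumes a previous build/ directory exists):
#   npm run serve -- --host 0.0.0.0 --port 3000 &
#   npm install --legacy-peer-deps \
#     && npm run build -- --out-dir build.next \
#     && promote_build build build.next
#   wait
```

The two mv calls are not atomic, so there is a very short moment in which the live directory does not exist. A symlink swap would close that gap at the cost of a slightly more involved setup.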